Amendments to the Claims:

The following listing of claims will replace all prior versions, and listings, of claims in 
the application: 

1. (Currently Amended) A processor implemented method of identifying a
document type of a document in machine-readable form without structurally analyzing the 
document text, the processor implemented method comprising: 

a) selecting a first set of nonstructural surface cues; 

b) generating a cue vector from the text, the cue vector having a value for
each of the selected cues and representing a frequency of occurrences in the text of the first
set of nonstructural surface cues;

c) associating a weighting vector with a first text genre; and

d) determining whether the text is an instance of the first text genre
using the cue vector and the weighting vector associated with the first text genre,
wherein the first set of cues includes a punctuational cue. 

2. (Canceled) 

3. (Previously Presented) The method of claim 1, wherein the punctuational cue 
represents a one of a number of commas in the text, a number of dashes in the text, a number 
of question marks in the text and a number of semi-colons in the text. 

4. (Previously Presented) The method of claim 1, wherein the first set of cues 
includes a string recognizable constructional cue. 

5. (Previously Presented) The method of claim 4, wherein the string 
recognizable constructional cue represents a one of a first number of sentences starting with 
the words "and", "but" and "so" and a second number of sentences starting with an adverb and 
a comma.

6. (Previously Presented) The method of claim 1, wherein the first set of cues
includes a formulae cue. 

7. (Previously Presented) The method of claim 1, wherein the first set of cues 
includes a lexical cue. 

8. (Previously Presented) The method of claim 7, wherein the lexical cue 
represents a one of a first number of occurrences in the text of acronyms, a second number of 
occurrences in the text of modal auxiliaries, a third number of occurrences of form of the verb 
"be", and a fourth number of occurrences of calendar words. 

9. (Previously Presented) The method of claim 7, wherein the lexical cue 
represents a one of a first number of occurrences in the text of capitalized words, a second 
number of occurrences in the text of contractions, a third number of occurrences in the text of 
words that end in "ed", and a fourth number of occurrences in the text of mathematical 
formulas. 

10. (Previously Presented) The method of claim 7, wherein the lexical cue 
represents a one of a first number of occurrences in the text of polysyllabic words, a second 
number of occurrences in the text of the word "it", a third number of occurrences in the text 
of latinate prefixes and suffixes, and a fourth number of occurrences in the text of overt 
negatives. 

11. (Previously Presented) The method of claim 7, wherein the lexical cue
represents a one of a first number of occurrences in the text of words including at least one 
digit, a second number of occurrences in the text of left parenthesis, a third number of 
occurrences in the text of prepositions, a fourth number of occurrences in the text of first 
person pronouns, and a fifth number of occurrences in the text of second person pronouns. 

12. (Previously Presented) The method of claim 7, wherein the lexical cue 
represents a one of a first number of occurrences in the text of quotation marks, a second 
number of occurrences in the text of roman numerals, a third number of occurrences in the
text of "that", and a fourth number of occurrences in the text of "which". 

13. (Previously Presented) The method of claim 1, wherein the first set of cues 
includes a deviation cue. 

14. (Previously Presented) The method of claim 13, wherein the deviation cue 
includes a one of a first deviation of a sentence length of the text and a second deviation of a 
word length of the text. 

15. (Previously Presented) The method of claim 3, wherein the first set of cues
further includes a second set of lexical cues, a third set of string recognizable constructional 
cues, a fourth set of formulae cues and fifth set of deviation cues. 

16. (Previously Presented) The method of claim 15, wherein the second set of 
lexical cues includes at least a one lexical cue representing a one of a first number of 
occurrences in the text of acronyms, a second number of occurrences in the text of modal 
auxiliaries, a third number of occurrences of form of the verb "be", a fourth number of 
occurrences of calendar words, a fifth number of occurrences in the text of capitalized words, 
a sixth number of occurrences in the text of contractions, a seventh number of occurrences in 
the text of words that end in "ed", an eighth number of occurrences in the text of mathematical
formulas, a ninth number of occurrences in the text of polysyllabic words, a tenth number of 
occurrences in the text of the word "it", an eleventh number of occurrences in the text of 
Latinate prefixes and suffixes, a twelfth number of occurrences in the text of overt negatives, 
a thirteenth number of occurrences in the text of words including at least one digit, a 
fourteenth number of occurrences in the text of parenthesis, a fifteenth number of occurrences 
in the text of prepositions, a sixteenth number of occurrences in the text of first person 
pronouns, a seventeenth number of occurrences in the text of second person pronouns, an 
eighteenth number of occurrences in the text of quotation marks, a nineteenth number of 
occurrences in the text of roman numerals, a twentieth number of occurrences in the text of
"that", and a twenty-first number of occurrences in the text of "which". 

17. (Previously Presented) The method of claim 15, wherein the third set of string 
recognizable constructional cues includes at least one string recognizable constructional cue 
representing a one of a first number of sentences starting with the words "and", "but" and "so" 
and a second number of sentences starting with an adverb and a comma. 

18. (Previously Presented) The method of claim 15, wherein the fifth set of 
deviation cues includes at least one deviation cue representing a one of a first deviation of a 
sentence length of the text and a second deviation of a word length of the text. 

19. (Currently Amended) A processor implemented method of identifying a 
document type of a document in machine-readable form without structurally analyzing the 
document text, the processor implemented method comprising the steps of: 
a) selecting a first set of nonstructural surface cues; 

b) generating a cue vector from the text, the cue vector having a value for
each of the selected cues and representing a frequency of occurrences in the text of the first
set of nonstructural surface cues;

c) determining a relevancy to the text of each facet of a second set of
facets using the cue vector and a weighting vector; and

d) identifying from a third set of document types a document type of the
text based upon those facets of the second set that are relevant to the text, 

wherein the first set of cues includes a punctuational cue. 

20. (Canceled) 

21. (Previously Presented) The method of claim 19, wherein the first set of cues 
includes one of a lexical cue, a string recognizable constructional cue, a formulae cue and a 
deviation cue.

22. (Previously Presented) The method of claim 19, wherein the second set of
facets includes at least a one of a date facet, a narrative facet, a suasive facet, a fiction facet, a 
legal facet, a science and technical facet, and an author facet. 

23. (Previously Presented) The method of claim 19, wherein the third set of text 
genre types includes at least a one of a press report type, an Email type, an editorial opinion 
type, and a market analysis type. 

24. (Previously Presented) The method of claim 21, wherein the second set of
facets includes at least a one of a date facet, a narrative facet, a suasive facet, a fiction facet, a 
legal facet, a science and technical facet, and an author facet. 

25. (Previously Presented) The method of claim 24, wherein the third set of text 
genre types includes at least a one of a press report type, an Email type, an editorial opinion 
type, and a market analysis type. 

26. (Currently Amended) An article of manufacture comprising: 

a) a memory; and 

b) instructions stored in the memory for a method of identifying a 
document type of a document in machine-readable form without structurally analyzing the 
document text, the method being implemented by a processor coupled to the memory, the 
instructions comprising the steps of: 

1) selecting a first set of nonstructural surface cues; 

2) generating a cue vector from the text, the cue vector having a value for
each of the selected cues and representing a frequency of occurrences in the text of the first
set of nonstructural surface cues;

3) associating a weighting vector with a first text genre; and 

4) determining whether the text is an instance of the first text genre
using the cue vector and the weighting vector associated with the first text genre,
wherein the first set of cues includes a punctuational cue.

27. (Currently Amended) An article of manufacture comprising:

a) a memory; and 

b) instructions stored in the memory for a method of identifying a 
document type of a document in machine-readable form without structurally analyzing the 
document text, the method being implemented by a processor coupled to memory, the 
instructions comprising the steps of: 

1) selecting a first set of nonstructural surface cues;

2) generating a cue vector from the text, the cue vector having a value for
each of the selected cues and representing a frequency of occurrences in the text of the first
set of nonstructural surface cues;

3) determining a relevancy to the text of each facet of a second set of
facets using the cue vector and a weighting vector; and

4) identifying from a third set of text genre types a text genre type of the
text based upon those facets of the second set that are relevant to the text, 

wherein the first set of cues includes a punctuational cue. 
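
For illustration only, the steps recited in the independent claims above can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the claims do not specify how the cue vector and weighting vector are combined, so the logistic decision rule, the 0.5 threshold, every weight value, and the facet-to-type lookup table below are hypothetical, as are all function and variable names.

```python
import math

# Hypothetical cue set: four punctuational cues, in a fixed order.
# Note: counting "-" also counts hyphens inside words.
PUNCTUATION_CUES = {"commas": ",", "dashes": "-", "question_marks": "?", "semicolons": ";"}


def cue_vector(text):
    """Return one value per selected cue: occurrences per whitespace-separated word."""
    n_words = max(len(text.split()), 1)  # avoid division by zero on empty text
    return [text.count(mark) / n_words for mark in PUNCTUATION_CUES.values()]


def is_instance(cues, weights, threshold=0.5):
    """Assumed decision rule: logistic function of the weighted sum of cue values."""
    score = sum(c * w for c, w in zip(cues, weights))
    return 1.0 / (1.0 + math.exp(-score)) >= threshold


# Hypothetical weighting vectors, one per facet (values invented for illustration).
FACET_WEIGHTS = {
    "narrative": [1.0, 0.0, -5.0, 0.0],
    "suasive": [0.0, 0.0, 5.0, 0.0],
}

# Hypothetical mapping from document type to the facets it requires.
TYPE_FACETS = {
    "press report": {"narrative"},
    "editorial opinion": {"suasive"},
}


def relevant_facets(cues):
    """Return the facets whose weighting vector marks them as relevant to the text."""
    return {facet for facet, weights in FACET_WEIGHTS.items() if is_instance(cues, weights)}


def document_type(cues):
    """Return the first document type whose required facets are all relevant, else None."""
    facets = relevant_facets(cues)
    for doc_type, needed in TYPE_FACETS.items():
        if needed <= facets:
            return doc_type
    return None
```

In this sketch, `cue_vector` and `is_instance` correspond to the generating and determining steps of claims 1 and 26, while `relevant_facets` and `document_type` correspond to the facet relevancy and identification steps of claims 19 and 27. In practice the weighting vectors would be obtained from training data rather than written by hand.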



